NSF PAR Search | NSF Public Access Repository

Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters

https://doi.org/10.1145/3716368.3735301

Moore, Hayden; Qi, Sirui; Hogade, Ninad; Milojicic, Dejan; Bash, Cullen; Pasricha, Sudeep (July 2025, ACM GLSVLSI 2025)

Free, publicly-accessible full text available July 2, 2026
HeteroBench: Multi-kernel Benchmarks for Heterogeneous Systems

https://doi.org/10.1145/3676151.3719366

Tian, Hongzheng; Mishra, Alok; Chen, Zhiheng; Hong Enriquez, Rolando P.; Milojicic, Dejan; Frachtenberg, Eitan; Huang, Sitao (May 2025, ACM)

The end of Moore’s Law and Dennard scaling has driven the proliferation of heterogeneous systems that combine CPUs, GPUs, and FPGAs, each with distinct architectures, compilers, and programming environments. GPUs excel at massively parallel processing for tasks like deep learning training and graphics rendering, while FPGAs offer hardware-level flexibility and energy efficiency for low-latency, high-throughput applications. In contrast, CPUs, while general-purpose, often fall short in high-parallelism or power-constrained applications. This architectural diversity makes it challenging to compare these devices effectively, leading to uncertainty in selecting optimal hardware and software tools for specific applications. To address this challenge, we introduce HeteroBench, a versatile benchmark suite for heterogeneous systems. HeteroBench allows users to evaluate multi-compute-kernel applications across various devices, including CPUs, GPUs (from NVIDIA, AMD, and Intel), and FPGAs (AMD). It supports the following programming environments: Python, Numba-accelerated Python, serial C++, OpenMP (for both CPUs and GPUs), OpenACC and CUDA for GPUs, and Vitis HLS for FPGAs. This setup enables users to assign kernels to suitable hardware platforms, ensuring comprehensive device comparisons. What makes HeteroBench unique is its vendor-agnostic, cross-platform approach, spanning diverse domains such as image processing, machine learning, numerical computation, and physical simulation, which provides deeper insights for HPC optimization. Extensive testing across multiple systems provides practical reference points for HPC practitioners, simplifying hardware selection and performance tuning for developers and end users alike. This suite may help users make more informed decisions on AI/ML deployment and HPC development, making it an invaluable resource for advancing academic research and industrial applications.
Free, publicly-accessible full text available May 5, 2026
A Framework for SLO, Carbon, and Wastewater-Aware Sustainable FaaS Cloud Platform Management

https://doi.org/10.1109/IGSC64514.2024.00015

Qi, Sirui; Moore, Hayden; Hogade, Ninad; Milojicic, Dejan; Bash, Cullen; Pasricha, Sudeep (November 2024, IEEE IGSC 2024)

Full Text Available
MOSAIC: A Multi-Objective Optimization Framework for Sustainable Datacenter Management

https://doi.org/10.1109/HiPC58850.2023.00046

Qi, Sirui; Milojicic, Dejan; Bash, Cullen; Pasricha, Sudeep (December 2023, IEEE)

Full Text Available
SHIELD: Sustainable Hybrid Evolutionary Learning Framework for Carbon, Wastewater, and Energy-Aware Data Center Management

https://doi.org/10.1145/3634769.3634810

Qi, Sirui; Milojicic, Dejan; Bash, Cullen; Pasricha, Sudeep (October 2023, ACM)

Full Text Available
Fine-grained accelerator partitioning for Machine Learning and Scientific Computing in Function as a Service Platform

https://doi.org/10.1145/3624062.3624238

Dhakal, Aditya; Raith, Philipp; Ward, Logan; Hong Enriquez, Rolando P.; Rattihalli, Gourav; Chard, Kyle; Foster, Ian; Milojicic, Dejan (November 2023, ACM)

Full Text Available
Optimizing Post-Copy Live Migration with System-Level Checkpoint Using Fabric-Attached Memory

https://doi.org/10.1109/MCHPC49590.2019.00010

Chou, Chih Chieh; Chen, Yuan; Milojicic, Dejan; Reddy, Narasimha; Gratz, Paul (November 2019, 2019 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC))

Full Text Available
Fast in-memory CRIU for docker containers

https://doi.org/10.1145/3357526.3357542

Venkatesh, Ranjan Sarpangala; Smejkal, Till; Milojicic, Dejan S.; Gavrilovska, Ada (January 2019, MEMSYS '19: Proceedings of the International Symposium on Memory Systems)

Server systems with large amounts of physical memory can benefit from using some of the available memory capacity for in-memory snapshots of ongoing computations. In-memory snapshots are useful for services such as scaling of new workload instances, debugging, and scheduling, which do not require snapshot persistence across node crashes or reboots. Because servers increasingly run containerized workloads using technologies such as Docker, the snapshot and subsequent restore mechanisms would be applied at the granularity of containers. However, CRIU, the current approach to snapshotting and restoring containers, suffers from expensive filesystem write/read operations on image files containing memory pages, which dominate the runtime costs and limit the potential benefits of manipulating in-memory process state. In this paper, we demonstrate that these overheads can be eliminated by using MVAS, which provides kernel support for multiple independent virtual address spaces (VAS) and is designed specifically for machines with large memory capacities. The resulting VAS-CRIU stores application memory as a separate snapshot address space in DRAM and avoids costly file system operations. This accelerates the snapshot/restore of address spaces by two orders of magnitude, resulting in an overall reduction in snapshot time by up to 10× and restore time by up to 9×. We demonstrate the utility of VAS-CRIU for container management services such as fine-grained snapshot generation and container instance scaling.
Full Text Available